AITopics | Burlington

Burlington

Optimized Architectures for Kolmogorov-Arnold Networks

arXiv.org Machine Learning, Dec-16-2025

Efforts to improve Kolmogorov-Arnold networks (KANs) with architectural enhancements have been stymied by the complexity those enhancements bring, undermining the interpretability that makes KANs attractive in the first place. Here we study overprovisioned architectures combined with sparsification to learn compact, interpretable KANs without sacrificing accuracy. Crucially, we focus on differentiable sparsification, turning architecture search into an end-to-end optimization problem. Across function approximation benchmarks, dynamical systems forecasting, and real-world prediction tasks, we demonstrate competitive or superior accuracy while discovering substantially smaller models. Overprovisioning and sparsification are synergistic, with the combination outperforming either alone. The result is a principled path toward models that are both more expressive and more interpretable, addressing a key tension in scientific machine learning.

activation function, architecture, sparsification, (14 more...)

arXiv.org Machine Learning

2512.12448

Country:

North America > United States > Vermont > Chittenden County > Burlington (0.14)
Europe > Russia (0.04)
Asia > Russia (0.04)
Asia > Japan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Materials (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)


Softly Symbolifying Kolmogorov-Arnold Networks

Bagrow, James, Bongard, Josh

arXiv.org Machine Learning, Dec-10-2025

Kolmogorov-Arnold Networks (KANs) offer a promising path toward interpretable machine learning: their learnable activations can be studied individually, while collectively fitting complex data accurately. In practice, however, trained activations often lack symbolic fidelity, learning pathological decompositions with no meaningful correspondence to interpretable forms. We propose Softly Symbolified Kolmogorov-Arnold Networks (S2KAN), which integrate symbolic primitives directly into training. Each activation draws from a dictionary of symbolic and dense terms, with learnable gates that sparsify the representation. Crucially, this sparsification is differentiable, enabling end-to-end optimization, and is guided by a principled Minimum Description Length objective. When symbolic terms suffice, S2KAN discovers interpretable forms; when they do not, it gracefully degrades to dense splines. We demonstrate competitive or superior accuracy with substantially smaller models across symbolic benchmarks, dynamical systems forecasting, and real-world prediction tasks, and observe evidence of emergent self-sparsification even without regularization pressure.
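The gated-dictionary idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the primitive set, the sigmoid gate, and the plain sum-of-gates penalty are all assumptions chosen for illustration, and the dense spline terms and the Minimum Description Length objective mentioned in the abstract are omitted.

```python
import math

def sigmoid(z):
    """Smooth gate in (0, 1); differentiable, so it could be trained by gradient descent."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical dictionary of symbolic primitives (assumption, not from the paper).
PRIMITIVES = {
    "identity": lambda x: x,
    "square": lambda x: x * x,
    "sin": math.sin,
}

def gated_activation(x, weights, gate_logits):
    """Evaluate sum_k sigmoid(gate_logits[k]) * weights[k] * primitive_k(x)."""
    total = 0.0
    for name, f in PRIMITIVES.items():
        total += sigmoid(gate_logits[name]) * weights[name] * f(x)
    return total

def gate_penalty(gate_logits):
    """Simple sparsity surrogate: the sum of gate openings (stand-in for an MDL term)."""
    return sum(sigmoid(a) for a in gate_logits.values())

# Example: only the "square" gate is open, so the activation behaves like x**2.
weights = {"identity": 1.0, "square": 1.0, "sin": 1.0}
gate_logits = {"identity": -20.0, "square": 20.0, "sin": -20.0}
value_at_3 = gated_activation(3.0, weights, gate_logits)
penalty = gate_penalty(gate_logits)
```

When a gate logit is driven strongly negative, its term contributes almost nothing, which is how a soft gate can prune a dictionary entry while keeping the whole expression differentiable.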

activation function, representation, s2kan, (12 more...)

arXiv.org Machine Learning

2512.07875

Country:

North America > United States > Vermont > Chittenden County > Burlington (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Russia (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Materials > Construction Materials (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.34)


The Seeds of Scheming: Weakness of Will in the Building Blocks of Agentic Systems

Yang, Robert

arXiv.org Artificial Intelligence, Dec-8-2025

Large language models display a peculiar form of inconsistency: they "know" the correct answer but fail to act on it. In human philosophy, this tension between global judgment and local impulse is called akrasia, or weakness of will. We propose akrasia as a foundational concept for analyzing inconsistency and goal drift in agentic AI systems. To operationalize it, we introduce a preliminary version of the Akrasia Benchmark, currently a structured set of prompting conditions (Baseline [B], Synonym [S], Temporal [T], and Temptation [X]) that measures when a model's local response contradicts its own prior commitments. The benchmark enables quantitative comparison of "self-control" across model families, decoding strategies, and temptation types. Beyond single-model evaluation, we outline how micro-level akrasia may compound into macro-level instability in multi-agent systems that may be interpreted as "scheming" or deliberate misalignment. By reframing inconsistency as weakness of will, this work connects agentic behavior to classical theories of agency and provides an empirical bridge between philosophy, psychology, and the emerging science of agentic AI.
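One way to picture a per-condition measurement like the one described above is a contradiction rate: the fraction of trials in which a model's local response differs from its earlier commitment. The sketch below is only an illustration under stated assumptions. The record layout, the exact-match comparison, and the sample data are hypothetical; the abstract does not specify how the benchmark scores contradictions.

```python
from collections import defaultdict

def contradiction_rates(records):
    """Compute the fraction of contradicted commitments per prompting condition.

    records: iterable of (condition, committed_answer, local_answer) tuples.
    A trial counts as a contradiction when the two answers differ
    (exact match is an assumption made for this sketch).
    """
    totals = defaultdict(int)
    contradictions = defaultdict(int)
    for condition, committed, local in records:
        totals[condition] += 1
        if local != committed:
            contradictions[condition] += 1
    return {c: contradictions[c] / totals[c] for c in totals}

# Hypothetical trials for the Baseline [B] and Temptation [X] conditions.
sample_records = [
    ("B", "yes", "yes"),
    ("B", "yes", "yes"),
    ("X", "yes", "no"),
    ("X", "yes", "yes"),
]
rates = contradiction_rates(sample_records)
```

Comparing the rate under a perturbed condition with the Baseline rate gives a rough sense of how much that perturbation erodes consistency.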

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2512.05449

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Vermont > Chittenden County > Burlington (0.04)
(7 more...)

Genre: Research Report (0.50)

Industry: Law (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.97)


Misalignment of LLM-Generated Personas with Human Perceptions in Low-Resource Settings

Prama, Tabia Tanzin, Danforth, Christopher M., Dodds, Peter Sheridan

arXiv.org Artificial Intelligence, Dec-3-2025

Recent advances enable Large Language Models (LLMs) to generate AI personas, yet their lack of deep contextual, cultural, and emotional understanding poses a significant limitation. This study quantitatively compared human responses with those of eight LLM-generated social personas (e.g., Male, Female, Muslim, Political Supporter) within a low-resource environment like Bangladesh, using culturally specific questions. Results show human responses significantly outperform all LLMs in answering questions, and across all metrics of persona perception, with particularly large gaps in empathy and credibility. Furthermore, LLM-generated content exhibited a systematic bias along the lines of the "Pollyanna Principle", scoring measurably higher in positive sentiment ($\Phi_{avg} = 5.99$ for LLMs vs. $5.60$ for Humans). These findings suggest that LLM personas do not accurately reflect the authentic experience of real people in resource-scarce environments. It is essential to validate LLM personas against real-world human data to ensure their alignment and reliability before deploying them in social science research.

large language model, machine learning, persona, (17 more...)

arXiv.org Artificial Intelligence

2512.02058

Country:

Asia > Bangladesh (0.29)
North America > United States > Vermont > Chittenden County > Burlington (0.04)
North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.52)


LLMs for Low-Resource Dialect Translation Using Context-Aware Prompting: A Case Study on Sylheti

Prama, Tabia Tanzin, Danforth, Christopher M., Dodds, Peter Sheridan

arXiv.org Artificial Intelligence, Dec-1-2025

Large Language Models (LLMs) have demonstrated strong translation abilities through prompting, even without task-specific training. However, their effectiveness in dialectal and low-resource contexts remains underexplored. This study presents the first systematic investigation of LLM-based machine translation (MT) for Sylheti, a dialect of Bangla that is itself low-resource. We evaluate five advanced LLMs (GPT-4.1, GPT-4.1, LLaMA 4, Grok 3, and DeepSeek V3.2) across both translation directions (Bangla $\Leftrightarrow$ Sylheti), and find that these models struggle with dialect-specific vocabulary. To address this, we introduce Sylheti-CAP (Context-Aware Prompting), a three-step framework that embeds a linguistic rulebook, a dictionary (2,260 core vocabulary items and idioms), and an authenticity check directly into prompts. Extensive experiments show that Sylheti-CAP consistently improves translation quality across models and prompting strategies. Both automatic metrics and human evaluations confirm its effectiveness, while qualitative analysis reveals notable reductions in hallucinations, ambiguities, and awkward phrasing, establishing Sylheti-CAP as a scalable solution for dialectal and low-resource MT. Dataset link: https://github.com/TabiaTanzin/LLMs-for-Low-Resource-Dialect-Translation-Using-Context-Aware-Prompting-A-Case-Study-on-Sylheti.git

large language model, machine learning, translation, (18 more...)

arXiv.org Artificial Intelligence

2511.21761

Country:

North America > United States > Vermont > Chittenden County > Burlington (0.14)
North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Computational frame analysis revisited: On LLMs for studying news coverage

Kunjar, Sharaj, Smith, Alyssa Hasegawa, Mckenzie, Tyler R, Mohbe, Rushali, Scarpino, Samuel V, Welles, Brooke Foucault

arXiv.org Artificial Intelligence, Nov-25-2025

Computational approaches have previously shown various promises and pitfalls when it comes to the reliable identification of media frames. Generative LLMs like GPT and Claude are increasingly being used as content analytical tools, but how effective are they for frame analysis? We address this question by systematically evaluating them against their computational predecessors: bag-of-words models and encoder-only transformers; and traditional manual coding procedures. Our analysis rests on a novel gold standard dataset that we inductively and iteratively developed through the study, investigating six months of news coverage of the US Mpox epidemic of 2022. While we discover some potential applications for generative LLMs, we demonstrate that they were consistently outperformed by manual coders, and in some instances, by smaller language models. Some form of human validation was always necessary to determine appropriate model choice. Additionally, by examining how the suitability of various approaches depended on the nature of different tasks that were part of our frame analytical workflow, we provide insights as to how researchers may leverage the complementarity of these approaches to use them in tandem. We conclude by endorsing a methodologically pluralistic approach and put forth a roadmap for computational frame analysis for researchers going forward.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.17746

Country:

North America > United States > Minnesota (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(11 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Media > News (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(5 more...)


Multi-Armed Bandits with Metric Movement Costs

Tomer Koren, Roi Livni, Yishay Mansour

Neural Information Processing Systems, Nov-21-2025, 12:37:12 GMT

We consider the non-stochastic Multi-Armed Bandit problem in a setting where there is a fixed and known metric on the action space that determines a cost for switching between any pair of actions.
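To make the setting concrete, the sketch below computes a learner's total cost when switching actions is charged according to a metric: the sum of per-round losses plus the distance between each pair of consecutive actions. The loss table and the choice of the absolute difference between arm indices as the metric are assumptions made for illustration, and the function name is hypothetical.

```python
def total_cost(actions, losses, metric):
    """Sum of incurred losses plus metric movement costs between consecutive actions.

    actions: list of arm indices, one per round.
    losses:  losses[t][a] is the loss of arm a in round t.
    metric:  function (a, b) -> non-negative switching cost.
    """
    play_cost = sum(losses[t][a] for t, a in enumerate(actions))
    move_cost = sum(metric(a, b) for a, b in zip(actions, actions[1:]))
    return play_cost + move_cost

# Hypothetical example: three arms on a line, metric = distance between indices.
example_losses = [
    [0.1, 0.5, 0.9],
    [0.4, 0.2, 0.3],
    [0.6, 0.1, 0.0],
    [0.2, 0.7, 0.5],
]
example_actions = [0, 2, 2, 1]
cost = total_cost(example_actions, example_losses, lambda a, b: abs(a - b))
```

Staying on the same arm costs nothing extra, so a learner in this setting has to trade off chasing low losses against the price of moving.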

artificial intelligence, data mining, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Vermont > Chittenden County > Burlington (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


e9a612969b4df241ff0d8273656bd5a4-Paper-Conference.pdf

Neural Information Processing Systems, Nov-19-2025, 23:34:26 GMT

We propose a novel and efficient search algorithm which finds initial centers that can be used subsequently for the local search algorithm.

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Arizona > Maricopa County > Phoenix (0.04)
(17 more...)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)


Higher-Order Responsibility

Jiang, Junli, Naumov, Pavel

arXiv.org Artificial Intelligence, Nov-13-2025

In ethics, individual responsibility is often defined through Frankfurt's principle of alternative possibilities. This definition is not adequate in a group decision-making setting because it often results in the lack of a responsible party or "responsibility gap". One of the existing approaches to address this problem is to consider group responsibility. Another, recently proposed, approach is "higher-order" responsibility. The paper considers the problem of deciding if higher-order responsibility up to degree $d$ is enough to close the responsibility gap. The main technical result is that this problem is $\Pi_{2d+1}$-complete.

artificial intelligence, game theory, responsibility gap, (16 more...)

arXiv.org Artificial Intelligence

2506.01003

Country:

Asia > China (0.04)
North America > United States > Vermont > Chittenden County > Burlington (0.04)
Europe > United Kingdom > England > Hampshire > Southampton (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Robots (0.93)
Information Technology > Game Theory (0.68)


Transformer-Based Low-Resource Language Translation: A Study on Standard Bengali to Sylheti

Oni, Mangsura Kabir, Prama, Tabia Tanzin

arXiv.org Artificial Intelligence, Oct-23-2025

Although the findings highlight the effectiveness of fine-tuned transformer models for Bengali-Sylheti translation, several limitations remain. The dataset size (5,002 parallel sentences) restricts the models' capacity to generalize across diverse syntactic structures, stylistic variations, and domain-specific expressions. In addition, orthographic inconsistencies in Sylheti introduce noise, leading to training instability, particularly in models like mBART-50. Another limitation is the reliance on automatic evaluation metrics such as BLEU and chrF, which may not fully capture the linguistic richness or cultural nuance of Sylheti. Future research should therefore focus on expanding the dataset through community-driven contributions and data augmentation strategies. Incorporating orthographic normalization could improve consistency and reduce variability during training. Hybrid approaches that combine the strengths of pre-trained LLMs with fine-tuned NMT models may also enhance translation robustness in low-resource settings. Finally, incorporating human evaluation will provide a more comprehensive assessment of translation adequacy, fluency, and cultural alignment.

large language model, machine learning, translation, (20 more...)

arXiv.org Artificial Intelligence

2510.18898

Country:

North America > United States > Vermont > Chittenden County > Burlington (0.14)
Asia > Bangladesh (0.05)
Asia > India (0.04)
(2 more...)

Genre:

Overview (0.68)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
